3. ADMINISTRATIVE ADVICE (3B20S)
GENERAL 


The information contained in this section pertains to the 3B20S Processor. For administrative advice
relative to DEC equipment, refer to tab section “ADMINISTRATIVE ADVICE (DEC)”.


ADMINISTRATOR’S ROAD MAP 


This section contains administrative advice based on the experience and suggestions of many system
administrators. Other reasonable approaches to many of the problem areas described may be taken. Getting
started as a UNIX system administrator is hard work. There are no real shortcuts to a working knowledge of
the system. The system administrator will need time for reading, studying, and hands-on experimenting. The
system administrator should not go “live” with the system until he/she has had two weeks to learn the job
and get the initial hardware quirks ironed out.


Do not consign the “Setting up the UNIX System” section to oblivion after the initial system “gen”. In addi- 
tion to information needed whenever adding or changing equipment, the section contains valuable material 
about system tuning (buffers, clists, etc.) that appears nowhere else. 


The administrator should be familiar with much of the distributed documentation. The “Introduction” and
“How to Get Started” sections of the UNIX System User’s Manual, as well as all of the sections of the UNIX
System Administrator’s Manual, should be studied.


Throughout this section, each reference of the form name(1M), name(7), or name(8) refers to entries in
the UNIX System Administrator’s Manual. All other references to entries of the form name(N), where “N”
is a number (1 through 6) possibly followed by a letter, refer to entry name in Section N of the UNIX System
User’s Manual.

In these manuals, pay special attention to: acct(1M), checkall(1M), chmod(1), chown(1), config(1M),
cpio(1), date(1), dcopy(1M), df(1M), don(1M), du(1), ed(1), env(1), errpt(1M), find(1), fsck(1M),
fuser(1M), kill(1), mail(1), mkdir(1), mkfs(1M), ncheck(1M), ps(1), rm(1), rmdir(1), shutdown(1M),
stty(1), su(1), sync(1M), time(1), volcopy(1M), wall(1M), who(1), and write(1); acct(4); all of Section 7;
and crash(8), dskfmt(8), and 3B20ops(8).

CONFIGURATION GUIDELINES 


Minimum recommended configurations are shown in Table 3.A. 
TABLE 3.A


3B20S RECOMMENDED CONFIGURATIONS 


[eee [os [ee | 


DMAC Channels 


1 
MHD (disks) 2 3 
1 1 


UN52 (tape) [2 cae eae 1 
TN83 (console) 


DISK FREE SPACE 


Making files is easy under the UNIX operating system. Administratively, both free disk blocks and free 
inodes (UNIX system talk for file headers) can be a problem. If the free inode count falls below 100, the system 
spends most of its time rebuilding the free inode array. If a file system runs out of space, the system prints “no- 
space” messages and does little else. To avoid problems, the following start-of-day free counts should be main- 
tained: 


• The file system containing /tmp (temporary files):
    - 16-user system: 1500 free kilobytes (KB).
    - 40-user system: 3000 free KB.

• The file system containing /usr:
    - 3000 to 6000 free KB, depending on load.

• Other user file systems:
    - 6 to 10 percent free, depending on user habits
      (3000 KB minimum).
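The start-of-day counts above lend themselves to a small check that an operator can run each morning. What follows is only a sketch: the function name check_free is hypothetical, and it assumes a POSIX-style df that accepts the -P and -k options, which is not necessarily what the df supplied with the 3B20S provides.

```shell
# Sketch only: check_free is a hypothetical helper, not a distributed command.
# Assumes a df that supports -P (portable output format) and -k (1-KB units).

# check_free MOUNTPOINT MIN_KB
# Prints one status line. Returns 0 if at least MIN_KB kilobytes are free,
# 1 if fewer are free, and 2 if the free count cannot be determined.
check_free() {
    fs=$1
    min_kb=$2
    # Line 2 of "df -Pk" output: filesystem, total, used, available, ...
    avail_kb=$(df -Pk "$fs" 2>/dev/null | awk 'NR == 2 { print $4 }')
    if [ -z "$avail_kb" ]; then
        echo "$fs: cannot determine free space"
        return 2
    fi
    if [ "$avail_kb" -ge "$min_kb" ]; then
        echo "$fs: ok ($avail_kb KB free)"
        return 0
    fi
    echo "$fs: LOW ($avail_kb KB free, want $min_kb KB)"
    return 1
}
```

On a 16-user system, for example, check_free /tmp 1500 and check_free /usr 3000 would cover the first two thresholds listed above.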


This brings up an associated problem: how big should file systems be? The preference is to set aside space
on each drive for a copy of root/swap and use the rest of the pack for a single file system. However, if there
are user groups that fight over disk space, it may be better to split them up arbitrarily (i.e., divide a pack into
more than one file system).


Warning: If different disk drives are set up with differing cylinder partitions between file 
systems, it will eventually lead to an operational blunder. 


A VERY FEW WORDS ABOUT SYSTEM TUNING 


As shipped, the UNIX system has no programs with the text-bit mode set [see chmod(1)]. The top contenders
for the t-bit are nroff, followed (generally) by the larger phases of the C compiler (including the assembler
and loader). The t-bit is only meaningful with pure text programs [ld(1) options -i or -n]. Do not overdo
it, and keep t-bit programs in the root file system.


A file system reorganization (described below) can help throughput but at the expense of down time. If the 
reorganization is done when the users are all asleep, it can help. 


If normal shutdown and filesave procedures are used, the file system check program [fsck(1M), -S option]
will help keep the disk free list in reasonable order. Try to keep disk drive usage balanced. If there are over
20 users, the root file system (/bin, /tmp, /etc, and swap) deserves a drive of its own. If there is a noisy modem
(poorly executed do-it-yourself null-modem) or a disconnected modem cable, the UNIX system will spend a lot
of CPU time trying to get it logged in. A random check of systems uncovers a lot of this going on.


WHY A SPARE DISK DRIVE IS NEEDED


Without a spare disk drive, the system will be down when a drive is down. Also, without the spare drive, 
it is difficult to reorganize file systems or to save and restore user files. 


DISK PACKS 


Only fully ECC (Error Correcting Code) correctable disk packs should be bought. The pack should be tested; 
and if uncorrectable errors develop, recondition the pack or get rid of it. 


Disk packs used with the UNIX system need not be totally error free, but must be “flag-free”. The term
flag-free means that there should be no unrecoverable ECC errors. Technically, proper ECC handling can
recover from 11-bit error bursts. However, the length of bursts can grow as a pack ages.

PROTECTING USER FILES 

Users, especially inexperienced ones, occasionally remove their own files. Open files are sometimes lost 
when the system crashes. Once in a great while, an entire file system will be destroyed (picture a disk controller 
that goes bad and writes when it should read). Here is a suggested file backup procedure: 

• Each day copy all user file systems to backup packs. Keep these packs 3 to 5 days before reusing them.

• Once a week copy each file system to tape. Keep weekly tapes for 8 weeks.

• Keep bimonthly tapes “forever” (they should be recopied once a year).


The most recent weekly tapes should be kept off premises. The other tapes should be in a fireproof safe if 
available and not too expensive. 


When the UNIX system goes down, active files can get scrambled. The users will not want to start the day
over every time the system fails. In addition to good backup, the system administrator must have file system
patching expertise available (on-site or on-call). If the system is ever rebooted for general use without first
checking the file systems, terrible things will happen. (Once this caused five duplicate entries on a file system
free list; this ruined over 100 new files within 8 days.) Study checkall(1M), fsck(1M), and crash(8) as well
as the “FILE SYSTEM CHECKING” section for more information.


UNIX FILE SYSTEM BACKUP PROGRAMS 
The following backup programs are distributed: 


• Find/cpio: The UNIX system is distributed in cpio format. The -cpio option of the find command can
  be used for saving only those files that have changed or been created over a definite period.

• Volcopy: Physical file system copying to disk or tape. For those with a spare drive, volcopy to disk
  provides convenient file restore and quick recovery from disk disasters. Tape volcopy provides good
  long-term backup because the file system can be read in fairly quickly, mounted, and browsed over. Disk
  and tape volcopy are generally used together for short- and long-term backup. Note that a volcopy
  from a mounted file system may result in an inconsistent copy (files being written at the time can
  contain invalid data).


The spare disk drive is strongly recommended. The speed and convenience of volcopy are by no means the
only advantage of a spare drive. It is strongly recommended that the administrator modify the /etc/filesave
and /etc/checklist files to meet the operational needs and update the local operator’s manual accordingly.
Remember, the more the administrator automates and documents operational procedures, the less downtime
will be encountered.


CONTROLLING DISK USAGE 


If the UNIX system is a success, disk space will soon become limited. During the long delay before more 
drives become available, usage should be controlled. Try to maintain the start-of-day counts recommended. 
Watch usage during the day by executing the df(1) command regularly. 


The du(1) command should be executed (after hours) regularly (e.g., daily), and the output kept in an
accessible file for later comparison. In this way, users rapidly increasing their disk usage may be spotted.
This can also be accomplished by running the accounting system’s acctdusg program [see acct(1M)] as shown
in “THE UNIX SYSTEM ACCOUNTING” section.
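Comparing two saved du(1) listings can be automated. The sketch below assumes each listing holds one line per directory consisting of a block count, white space, and a pathname, as du prints them. The function name du_growth and the file names in the usage note are hypothetical.

```shell
# Sketch only: du_growth is a hypothetical helper.

# du_growth OLD NEW MIN_BLOCKS
# Print "growth<TAB>pathname" for every pathname in NEW whose block count
# exceeds its count in OLD by at least MIN_BLOCKS. A pathname missing
# from OLD is treated as having had 0 blocks.
du_growth() {
    awk -v min="$3" '
        {
            blocks = $1
            path = $0
            sub(/^[ \t]*[^ \t]+[ \t]+/, "", path)   # strip the leading count
        }
        FILENAME == ARGV[1] { old[path] = blocks; next }   # first file: OLD
        {
            grew = blocks - ((path in old) ? old[path] : 0)
            if (grew >= min + 0)
                printf "%d\t%s\n", grew, path
        }
    ' "$1" "$2"
}
```

If yesterday's and today's listings were kept in, say, du.old and du.new, then du_growth du.old du.new 1000 would list the directories that grew by 1000 blocks or more overnight.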


The find(1) command can be used to locate inactive (or large) files. For example:

    find / -mtime +90 -atime +90 -print >somefile

records in somefile the names of files neither written nor accessed in the last 90 days.


The administrator will also have to balance usage between file systems. To do this, user directories must 
be moved. Users should be taught to accept file system name changes (and to program around them—preferably 
ahead of time). The user’s login directory name (available in the shell variable HOME) should be utilized to 
minimize pathname dependencies. User groups with more extensive file system structures should set up a shell 
variable to refer to the file system name (e.g., FS).


The find(1) and cpio(1) commands can be used to move user directories and to manipulate the file system
tree. The following sequence is useful (it moves the directory trees userx and usery from file system filesys1
to file system filesys2 where, presumably, more space is available):


    cd /filesys1
    find userx usery -print | cpio -pdm /filesys2
    #    Make sure new copy is OK.
    #    Change userx and usery login directories
    #    in the /etc/passwd file.
    #    Notify userx and usery via mail(1) that
    #    they have been moved and that pathname
    #    dependencies in their .profile and shell
    #    procedures may need to be changed. See the
    #    discussion on $HOME above.
    rm -rf /filesys1/userx /filesys1/usery


When moving more than one user in this way, keep users with common interests in the same file system (these 
users may have linked files) and move groups of users who may have linked files with a single cpio command 
(otherwise, linked files will be unlinked and duplicated). 
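The “Make sure new copy is OK” step in the sequence above can be partly mechanized. The following sketch compares only the pathnames found under the two file systems, not file contents, ownership, or link counts. The name tree_matches is a hypothetical helper, not a distributed command.

```shell
# Sketch only: tree_matches is a hypothetical helper.

# tree_matches SRC DST NAME...
# Return 0 if the pathnames found under SRC/NAME... are exactly those
# found under DST/NAME..., 1 if they differ, and 2 if either SRC or DST
# cannot be entered. File contents are not compared.
tree_matches() {
    src=$1
    dst=$2
    shift 2
    tmp=${TMPDIR:-/tmp}/tree.$$
    status=0
    (cd "$src" && find "$@" -print | sort) >"$tmp.src" 2>/dev/null || status=2
    (cd "$dst" && find "$@" -print | sort) >"$tmp.dst" 2>/dev/null || status=2
    if [ "$status" -eq 0 ]; then
        cmp -s "$tmp.src" "$tmp.dst" || status=1
    fi
    rm -f "$tmp.src" "$tmp.dst"
    return "$status"
}
```

In the example above, tree_matches /filesys1 /filesys2 userx usery would be run after the cpio step and before the rm step. A cautious administrator would still spot-check a few files by hand.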


REORGANIZING FILE SYSTEMS


There is a file system reorganization utility called dcopy(1M). On an otherwise idle system, a reorganized
file system has almost twice the I/O throughput of a randomly organized file system. This applies to file
copying, finds, fscks, etc. Dcopy can take up to 2.5 hours to initially reorganize (copy) a large file system.
During reorganization, the system can be up, but the file system being copied must be unmounted.


For those who can afford the operator time, root reorganization once a week (requires system reboot) and
user file system reorganization once a month will improve system performance. Dcopy is an interim step.


KEEPING DIRECTORY FILES SMALL 


Directories larger than 5K bytes (320 entries) are very inefficient because of file system indirection. A UNIX
system user once complained that it took the system 10 minutes to complete the login process; it turned out that
this user had a login directory 25K bytes long, and the login program spent that time fruitlessly looking for
a nonexistent .profile file. A large /usr/mail, /usr/spool/uucp, or /usr/rje?/rpool directory can also really slow
the system down. The following will ferret out such directories:


    find / -type d -size +10 -print


Removing files from directories does not make the directories get smaller (the empty directory entries are 
available for reuse). The following will “compact” /usr/mail (or any other directory): 


    mv /usr/mail /usr/omail
    mkdir /usr/mail
    chmod 777 /usr/mail
    cd /usr/omail
    find . -print | cpio -plm ../mail
    cd ..
    rm -rf omail

ADMINISTRATIVE USE OF “CRON”


The program cron(1M) is useful in the administration of the system; it can be used to:

• Turn off the programs in directory /usr/games during prime time.

• Run programs off-hours:
    - accounting;
    - file system administration;
    - long-running, user-written shell procedures
      [using the su(1) command], for example:

          su - userx userx_shell arg ...


WATCH OUT FOR FILES AND DIRECTORIES THAT GROW 


Most of the files below are restarted automatically by entries in /etc/rc at system reboot.

• Administrative log files:

    • /etc/wtmp: login information; grows extremely fast with terminal line difficulties; use
      acctcon(1M) to determine the offending line(s).

    • /usr/adm/pacct: per process accounting records; gets big quickly; monitored automatically by
      ckpacct from cron(1M).

    • /usr/adm/cronlog: status log of commands executed by cron(1M); also watch this file for error
      messages from the programs being executed in /usr/lib/crontab.

    • /usr/adm/errfile: hardware error logging info; also read login adm’s mail periodically.

    • /usr/adm/ctlog: a log of the people who use the ct(1C) command.

    • /usr/adm/sulog: a log of those who execute the superuser command.

    • /usr/adm/Spacct: process accounting files left over from an accounting failure; remove these
      files unless the accounting files that failed are to be rerun.

• Other files:

    • /usr/spool: spooling directory for line printers, uucp(1C), etc., whose subdirectories
      should be compacted as described above.

    • /usr/rje?/rpool: temporary storage place for print and punch jobs returning from the remote
      job entry facility; monitor RJE jobs returning huge amounts of output.

    • /usr/rje?/squeue: temporary storage area for jobs being submitted for transmission; watch
      out if RJE is down for a lengthy period.
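A periodic size check on the files listed above can be scripted. The sketch below is an illustration only: big_files is a hypothetical name, and sizes are given in 512-byte blocks, the unit find(1) uses for -size.

```shell
# Sketch only: big_files is a hypothetical helper.

# big_files BLOCKS PATH...
# List regular files at or under PATH... whose size exceeds BLOCKS
# 512-byte blocks. Error messages for missing paths are discarded.
big_files() {
    blocks=$1
    shift
    find "$@" -type f -size +"$blocks" -print 2>/dev/null
}
```

For instance, big_files 2000 /etc/wtmp /usr/adm/pacct /usr/adm/cronlog /usr/adm/sulog would report any of those logs that has grown past roughly 1 megabyte.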


ALLOCATING RESOURCES TO USERS 


A prospective user should first obtain authorization to use the system and then apply for a login by
providing the following information to the “system administrator”:

• User’s name.

• Suggested login name (not more than eight characters, beginning with a lowercase letter).

• Relationships to other users (this influences the choice of the file system).

• Estimate of required file space (this also influences the choice of the file system) and connect hours.
  This aids in hardware growth planning.

Users should be forced to have passwords not more than eight characters long (but more than five) and not
in Webster’s Unabridged; passwd(4) explains how to do that and also explains password aging.

THE MATTER OF ACCOUNTING AND USAGE


The system administrator should run the accounting programs even if there is not a “bill” for service. Other- 
wise, users’ habits (especially bad habits) will be a mystery to the administrator. Accounting information can 
also help find performance bottlenecks, unused logins, bad phone lines, etc. 


DIAL-LINE UTILIZATION


If prime-time dial-line utilization gets much over 70 percent, users will start to encounter busy signals when 
dialing in. This, in turn, will lead to “line hogging”. The only solutions are to acquire more dial-in ports, get 
a larger (another) machine, or get rid of users. Manual policing will help some, but “automatic” policing will 
be invariably subverted by users. 


“BIRD-DOGGING”


When the system is busy (lines busy and/or slow response), someone should determine why this is so. The
who(1) command lists the people logged in. The ps(1) command shows what they are doing. Unfortunately, ps 
operates from heuristics that can consistently fail to report certain processes in a busy system. That is, one must 
be careful about hanging up an apparently inactive line. The acctcom(1) command can read the process ac- 
counting file /usr/adm/pacct backwards from the most recent entry. It will print entries for selected lines or 
login names. 


TERMINALS 


Do not use uppercase-only terminals. Use full-duplex, full-ASCII asynchronous terminals. Hardware
horizontal tabbing is very desirable because it increases output speed and lowers system overhead. A fair
proportion of the terminals should provide for correspondence-quality hard copy output to take advantage
of the UNIX word processing capabilities; see term(5).


LINE PRINTERS 
There are two classes of printers that may be interfaced with the 3B20 Processors.


• Printers that work over an asynchronous link (DC1/DC3 protocol required, hardware tabs and character
  overstrike recommended). This class of printers may be able to operate at up to 300 to 400 lines per
  minute and would interface to a TN74 or TN4 slot.

• Printers that are equipped with a Dataproducts Long-line Interface, an Electronic Direct Access Vertical
  Format Unit, and a Trilog LAX logic board. This class of printers interfaces to the TN85 controller.
  (The TN85 can handle up to two printers with combined throughput up to 2000 lines per minute.)
  Currently, only the P300 and P600 printers manufactured by Printronix Inc. are supported. They can be
  ordered from a Western Electric sales representative.


SECURITY 


The current UNIX operating system is not tamperproof. The system administrator cannot keep people from
“breaking” the system but can usually detect that they have done so. The following command will mail (to root)
a list of all “set user ID” programs owned by root (superuser):


    find / -user root -perm -4100 -exec ls -l {} \; | mail root

Any surprises in root’s mail should be investigated. Related advice:


• Change the superuser password regularly. Do not pick obvious passwords (choose six to eight character
  nonsense strings that combine alphabetics with digits or special characters).

• Dial ports that do not require passwords usually cause trouble.

• The chroot(1M) and su(1) commands are inherently dangerous, as are group passwords.

• Login directories, .profile files, and files in /bin, /usr/bin, /lbin, and /etc that are writable by users
  other than their respective owners are security weak spots; police the system regularly against them.

• Remember, no time-sharing system with dial ports is really secure. Do not keep top secret information
  on the system.
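Two of the checks above can be run routinely. The sketch below is an illustration under stated assumptions: the helper names new_entries and loose_perms are hypothetical, and the baseline file is assumed to be a saved list of set-user-ID pathnames (one per line) from an earlier find run.

```shell
# Sketch only: new_entries and loose_perms are hypothetical helpers.

# new_entries BASELINE CURRENT
# Print lines present in CURRENT but not in BASELINE, e.g. set-user-ID
# programs that have appeared since the baseline list was saved.
new_entries() {
    tmp=${TMPDIR:-/tmp}/suid.$$
    sort "$1" >"$tmp.old"
    sort "$2" >"$tmp.new"
    comm -13 "$tmp.old" "$tmp.new"    # keep only lines unique to CURRENT
    rm -f "$tmp.old" "$tmp.new"
}

# loose_perms PATH...
# List files and directories at or under PATH... that are writable by
# group or by others. Symbolic links are skipped.
loose_perms() {
    find "$@" \( -perm -020 -o -perm -002 \) ! -type l -print
}
```

The two lists for new_entries could be produced with the find command shown earlier, using -print in place of -exec ls -l so that only pathnames are saved. Running loose_perms /bin /usr/bin /lbin /etc covers the system directories named above.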


CONSOLE LOGGING 


To ensure proper recording of console operations and system messages, a Receive-Only Printer (ROP) 
should be part of the console configuration. It should be used to shadow all console actions [see tn83(7)]. 


COMMUNICATING WITH THE USERS


The directory /usr/news and the news(1) command are provided as a way to get brief announcements to 
users. More pressing items (one-liners) can be entered in the /etc/motd (message of the day) file; motd and (new 
to the user) news items are announced at login time. 


To reach users who are already logged in, use the wall(1M) (write all) command. Do not use wall while 
logged in as superuser except in emergencies. 


The /usr/news directory should be cleaned out once a week by removing everything older than 2 months. 
It has been noted that on most systems a file in /usr/news will reach 50 percent of the users within a day and 
over 80 percent within a week; motd should be cleaned out daily. 
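The weekly cleanup of /usr/news can be sketched with find(1). This is an illustration, not a distributed procedure: stale_files is a hypothetical name, and “2 months” is approximated here as 60 days.

```shell
# Sketch only: stale_files is a hypothetical helper.

# stale_files DIR DAYS
# List regular files under DIR last modified more than DAYS days ago.
stale_files() {
    find "$1" -type f -mtime +"$2" -print
}
```

Running stale_files /usr/news 60 lists the candidates for review. Once the list looks right, the same find expression with -exec rm -f {} \; in place of -print removes them.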


TROUBLESHOOTING


It would be easy to write a book on troubleshooting. The following is some effective advice in dealing with 
troubles. 


In dealing with the hardware support services personnel: 


• Keep on top of troubles or problems. Remember that an unreported problem is getting no attention or
  priority. If a problem persists, escalate it through the local management chain; it may also be effective
  to complain to the local service or sales representative. The Western Electric support services offering
  includes automatic escalation of problems. If these service offerings are subscribed to, make sure the
  proper escalation procedures are followed.

• For effective service, an extended-period support service offering (e.g., 16 hours/day, 6 days/week)
  should be provided. Arrange for preventive maintenance, noncritical repair, and add-on (growth)
  installation work to be done before or after prime time.

• Know the details of the support service offering applicable to the installation. In particular, make sure
  that preventive maintenance is scheduled in advance and that it is completed.

• A “site log” should be maintained for the hardware. All troubles should be recorded in the log by the
  support service personnel and/or by the operating personnel.

• Run error logging and maintain console sheets. Make sure error messages are shown to support service
  personnel.

• Take core dumps after system crashes and interpret results for support service personnel.

• Keep records of downtime and make sure that support service personnel know about them.


Telephone problems are most apt to occur when rearranging or adding equipment. Occasionally, central of- 
fice, trunking, and modem failures occur. In dealing with the telephone services vendor: 


• Be specific with repair operators. Tell the operators that the trouble involves data equipment.

• If the first trouble report (call) fails to get results, ask for the “supervisor” on the second call and, if
  necessary, escalate further to get the trouble resolved.


Some of the obvious problem areas are: 


• Disk Drives: Over 50 percent of the problems are likely to be related to the disk subsystem. As
  mentioned earlier, the way to keep the system up is to have a spare disk drive. Remember that preventive
  maintenance of disk drives is very important. Make sure that the support service personnel who service
  the hardware see the error-logging printouts and console error messages produced by UNIX (and that
  the service personnel understand them). Disk failure can ruin a UNIX file system. The only defense is
  to make a complete, daily file backup! (See “Protecting User Files”.)

• Dial Ports: In the dial-in interface area, as well as in the area of synchronous data interfaces, there
  is room for finger-pointing among all involved vendors. Check the obvious things first: is the system
  in “multiuser” mode, is the /etc/inittab file OK, are any cables loose (at both ends)? In some telephone
  offices, trunk hunting is based on 10-number groups. Hunting between such groups can fail
  independently of anything else. The possibilities for trouble are many. Table 3.B attempts to describe
  some alternatives; it is meant primarily for users of the TN4 asynchronous devices. As an example of
  the format, (vertical) Rule 3 reads: “If line rings and ring light shows and computer does not answer
  and switching the modem solves the problem, then it is likely to be a telephone company problem; also,
  busy out that line.”


• Synchronous Ports: High-speed synchronous interface devices are even more trouble than dial
  equipment. The following is a list of potential trouble spots:

    - UNIX system software.
    - Interface device (e.g., UN53).
    - Cable to the modem.
    - The modem.
    - The communications line.
    - Other modem.
    - Other cable.
    - Other interface device.
    - Other system’s software.


Think of the finger-pointing possibilities. The best defense is a good line monitor. 
TABLE 3.B


ASYNCHRONOUS LINE PROBLEMS


CONDITION:
    Line rings
    Ring light shows on telephone console
    Computer answers
    Login message received on terminal
    Switching modem solves problem
    User can log in
    Telephone console shows data received
    Problem affects whole TN4 (up to 8 lines)


DIAGNOSIS AND/OR ACTION:
    No problem
    Processor hardware problem likely
    Telephone problem likely
    May be a problem with user’s terminal
    Busy out bad line(s)

DATA SET OPTIONS 


The following data set options seem to work with the UNIX system: 


The 801C-L1 (Auto-Call Unit): 
Jumpers: 
E2 to E3
E6 to E5 


Options: 
ee as Wea 5 
ZG, ZP, G, 
R, ZT 


Switches (0 = open, 1 = closed, i.e., side next to number is down):
    S1 = 1000[1]    (Bracketed switches are missing on some models.)
    S2 = 0101
    S3 = 11010
    S4 = 11[00]

The 212A-L1 (1200-baud full duplex):
Options: 
E, ZF, YF, YC, 
VG. Yoo K, 
5, V, A, T, ZH, 
W, YP, YR 
Switches:
    S1 = [0]001
    S2 = 110001000
    S3 = 11110000    (10100000 on 212AR-L1)
    S5 = 00


NULL MODEM WIRING


Improperly wired null modems can cause spurious interrupts, especially at higher baud rates. A single bad 
modem on a 9600-baud line can waste 15 percent of the CPU power. The following (symmetrical) wiring plan 
will prevent such problems: 


    pin 1 to 1
    pin 2 to 3
    pin 3 to 2
    strap pin 4 to 5 in the same plug
    pin 6 to 20
    pin 7 to 7
    pin 8 to 20
    pin 20 to 6 and 8
    ground unused pins.


A FEW WORDS ABOUT ACU WIRING 

In rack-mounted 801/212s, cabling is fairly simple: a TN75B cable is plugged into an 801 (the rack provides
the connection from the leftmost 801 to the leftmost 212) and the TN4 cable goes to the 212. Refer to the section
“AUTO CALL FACILITY INSTALLATION” for more information.